library(pander)
ydat <- read.csv("yDat.csv", header=TRUE)
My final guess at your model.
\[ Y_i = \beta_0 + \beta_1 X_2 + \beta_2 X_4 + \beta_3 X_9 + \beta_4 X_{18} + \beta_5 X_2 \, I(X_6 = 4) + \epsilon_i \]
with my Bonferroni-adjusted 95% confidence interval estimates of the coefficients (0.95 family-wise across all six intervals) as follows.
final.lm <- lm(Y ~ X2 + X4 + X9 + X18 + X2:I(X6==4), data=ydat)
mytable <- round(confint(final.lm, level=1-0.05/length(final.lm$coef)), 2)
betas <- paste0("$\\beta_", 0:(length(final.lm$coef)-1), "$")
rownames(mytable) <- betas
colnames(mytable) <- c("Lower", "Upper")
pander(mytable)
| | Lower | Upper |
|---|---|---|
| \(\beta_0\) | 1 | 1 |
| \(\beta_1\) | 1 | 1 |
| \(\beta_2\) | 1 | 1 |
| \(\beta_3\) | -7 | -7 |
| \(\beta_4\) | 6 | 6 |
| \(\beta_5\) | 1.2 | 1.2 |
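The `level=1-0.05/length(final.lm$coef)` argument above is a Bonferroni adjustment: each interval is made a little wider so that all of them hold together at roughly 95%. Here is a minimal sketch of that idea. It uses the built-in `mtcars` data as a stand-in (an assumption for illustration, not the `yDat.csv` data), and the names `fit`, `bonf_level`, and `widths` are made up for this example.

```r
# Stand-in data: the built-in mtcars set, NOT the yDat.csv used in this analysis.
fit <- lm(mpg ~ wt + hp, data = mtcars)

alpha <- 0.05                  # family-wise error rate
g <- length(coef(fit))         # number of intervals reported together
bonf_level <- 1 - alpha / g    # per-interval level under Bonferroni

ci_plain <- confint(fit, level = 1 - alpha)
ci_bonf  <- confint(fit, level = bonf_level)

# Compare interval widths: the Bonferroni intervals are the wider ones
widths <- cbind(plain      = ci_plain[, 2] - ci_plain[, 1],
                bonferroni = ci_bonf[, 2] - ci_bonf[, 1])
print(widths)
```

With six coefficients, as in the final model here, the per-interval level works out to 1 - 0.05/6, or about 0.9917.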
Glance at the data.
X3 can be dropped because it takes a single value throughout.
X19 can be dropped because it is perfectly related to X5.
Otherwise, nice work. Nothing obvious stood out in the plot. I stared at it for a while trying to decide where to go next. Finally, I just went with makeTable9.3’s suggestions.
pairs(ydat)
ydat <- ydat[,-c(4,20)]
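The two reasons for dropping columns (a constant column, and a column perfectly related to another) can also be checked in code rather than by eye. This is only a sketch on a small made-up data frame, `toy`, whose column names merely echo the ones above; it is not the real data, and it assumes the perfect relationship is linear, which the pairs plot may or may not show.

```r
# Toy data frame (hypothetical, for illustration only; not yDat.csv)
set.seed(1)
toy <- data.frame(Y = rnorm(20), X3 = rep(5, 20), X5 = rnorm(20))
toy$X19 <- 2 * toy$X5 + 1   # exact linear function of X5

# Columns with a single distinct value carry no information for lm()
constant_cols <- names(toy)[sapply(toy, function(col) length(unique(col)) == 1)]

# Among the remaining columns, find pairs whose |correlation| is numerically 1
keep <- setdiff(names(toy), constant_cols)
cors <- cor(toy[, keep])
cors[upper.tri(cors, diag = TRUE)] <- NA   # keep each pair only once
idx <- which(abs(cors) > 1 - 1e-8, arr.ind = TRUE)
perfect_pairs <- data.frame(var1 = rownames(cors)[idx[, "row"]],
                            var2 = colnames(cors)[idx[, "col"]])

print(constant_cols)
print(perfect_pairs)

# Dropping by name is less fragile than dropping by column position
toy_clean <- toy[, setdiff(names(toy), c("X3", "X19"))]
```

Dropping by name protects against the column order changing, which the positional `-c(4,20)` does not.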
Here is the model that Table 9.3 suggested, based on the AIC and PRESS criteria.
lm.1 <- lm(Y ~ X2 + X4 + X9 + X18, data=ydat)
summary(lm.1)
##
## Call:
## lm(formula = Y ~ X2 + X4 + X9 + X18, data = ydat)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1.28810 -0.20252  0.00871  0.18863  1.89601
##
## Coefficients:
##              Estimate Std. Error  t value Pr(>|t|)
## (Intercept)  0.729030   0.319165    2.284   0.0271 *
## X2           1.700774   0.115179   14.766  < 2e-16 ***
## X4           1.003762   0.005864  171.184  < 2e-16 ***
## X9          -7.017835   0.042868 -163.710  < 2e-16 ***
## X18          5.973591   0.511507   11.678 3.24e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5231 on 45 degrees of freedom
## Multiple R-squared: 0.9985, Adjusted R-squared: 0.9983
## F-statistic: 7372 on 4 and 45 DF, p-value: < 2.2e-16
par(mfrow=c(1,2))
plot(lm.1, which=1:2)
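For reference, the AIC and PRESS criteria mentioned above can be computed directly for any set of candidate `lm` fits. I do not know exactly how `makeTable9.3` computes them, so treat this as a rough sketch: the `press` helper and the three candidate models are hypothetical, and the data is the built-in `mtcars` set rather than `yDat.csv`.

```r
# Stand-in data: built-in mtcars, NOT yDat.csv
press <- function(fit) {
  # Sum of squared leave-one-out prediction errors, e_i / (1 - h_ii),
  # computed from one fit via the hat values (no refitting needed)
  sum((residuals(fit) / (1 - hatvalues(fit)))^2)
}

# Hypothetical candidate models, from smallest to largest
candidates <- list(
  small  = lm(mpg ~ wt, data = mtcars),
  medium = lm(mpg ~ wt + hp, data = mtcars),
  large  = lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
)

criteria <- data.frame(
  model = names(candidates),
  AIC   = sapply(candidates, AIC),
  PRESS = sapply(candidates, press)
)
print(criteria)
```

Smaller is better for both criteria, and a model that does well on both is a reasonable starting point.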
This next plot, of the lm.1 residuals against every variable with points colored by whether X6 equals 4, shows some exciting leads in X2 and X6.
pairs(cbind(R=lm.1$res, ydat), col=as.factor(ydat$X6==4))
lm.2 <- lm(Y ~ X2 + X4 + X9 + X18 + X2:I(X6==4), data=ydat)
par(mfrow=c(1,2))
plot(lm.2, which=1:2)
pairs(cbind(R=lm.2$res, ydat), col=as.factor(ydat$X6==4))
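Besides comparing the residual plots, one way to check whether an added interaction term like `X2:I(X6==4)` earns its place is a nested-model F-test with `anova()`, alongside the AIC of each fit. The sketch below is an assumption on my part and not part of the analysis above. It uses the built-in `mtcars` data, with `wt:I(cyl == 4)` standing in for the interaction, and `base_fit` and `inter_fit` are made-up names.

```r
# Stand-in data: built-in mtcars, NOT yDat.csv
base_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Add a slope change for wt within one group, mirroring X2:I(X6==4)
inter_fit <- update(base_fit, . ~ . + wt:I(cyl == 4))

# F-test of the larger model against the smaller one it contains
cmp <- anova(base_fit, inter_fit)
print(cmp)

# AIC for both fits; lower suggests the better trade-off
print(c(base = AIC(base_fit), interaction = AIC(inter_fit)))
```

Because the main effect of `wt` is already in the model, the interaction adds a single coefficient: the extra slope that applies only when the indicator is true.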